112 research outputs found
A Nonlinear Mixture Autoregressive Model For Speaker Verification
In this work, we apply a nonlinear mixture autoregressive (MixAR) model to supplant the Gaussian mixture model (GMM) for speaker verification. MixAR is a statistical model that is a probabilistically weighted combination of components, each of which is an autoregressive filter in addition to a mean. The probabilistic mixing and the data-dependent weights are responsible for the nonlinear nature of the model. Our experiments with synthetic as well as real speech data from standard speech corpora show that the MixAR model outperforms the GMM, especially under unseen noisy conditions. Moreover, MixAR did not require delta features and used 2.5x fewer parameters to achieve performance comparable to or better than that of a GMM using static as well as delta features. Also, MixAR suffered less from overfitting than the GMM when training data was sparse. However, MixAR performance deteriorated more quickly than that of the GMM when the duration of the evaluation data was reduced. This could constrain the minimum amount of evaluation data required when using the MixAR model for speaker verification.
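To make the model structure described above concrete, the following is a minimal sketch of a MixAR-style log-likelihood for a scalar signal. It assumes Gaussian components whose mean is an offset plus an autoregressive prediction, and it assumes the data-dependent weights are a softmax of a linear function of the previous sample. The gating form, the function names, and the parameter layout are illustrative assumptions, not taken from the paper.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mixar_log_likelihood(signal, components, gate):
    """Log-likelihood of `signal` under a hypothetical MixAR parameterisation.

    components: list of dicts with keys 'mean' (offset), 'ar' (list of AR
                coefficients, most recent lag first) and 'var' (variance).
    gate:       list of (bias, slope) pairs, one per component; the mixing
                weights are softmax(bias + slope * previous_sample).
                This gating form is an assumption made for illustration.
    """
    order = max(1, max(len(c["ar"]) for c in components))
    total = 0.0
    for n in range(order, len(signal)):
        # Data-dependent weights: they change with the previous sample.
        weights = softmax([b + s * signal[n - 1] for b, s in gate])
        density = 0.0
        for w, c in zip(weights, components):
            # Each component predicts x[n] with its own AR filter plus an offset.
            pred = c["mean"] + sum(a * signal[n - k - 1] for k, a in enumerate(c["ar"]))
            density += w * gaussian_pdf(signal[n], pred, c["var"])
        total += math.log(density)
    return total
```

With a single component, zero AR coefficients, and a flat gate, the sketch reduces to an ordinary Gaussian likelihood, which is a convenient sanity check.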
Lexical Speaker Error Correction: Leveraging Language Models for Speaker Diarization Error Correction
Speaker diarization (SD) is typically used with an automatic speech
recognition (ASR) system to ascribe speaker labels to recognized words. The
conventional approach reconciles outputs from independently optimized ASR and
SD systems, where the SD system typically uses only acoustic information to
identify the speakers in the audio stream. This approach can lead to speaker
errors especially around speaker turns and regions of speaker overlap. In this
paper, we propose a novel second-pass speaker error correction system using
lexical information, leveraging the power of modern language models (LMs). Our
experiments across multiple telephony datasets show that our approach is both
effective and robust. Training and tuning only on the Fisher dataset, this
error correction approach leads to relative word-level diarization error rate
(WDER) reductions of 15-30% on three telephony datasets: RT03-CTS, Callhome
American English and held-out portions of Fisher. Comment: Accepted at INTERSPEECH 202
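As a rough illustration of the metric quoted above, the sketch below computes a word-level diarization error rate over words that have already been aligned between the reference and the ASR output, together with a relative reduction. It assumes the alignment is done beforehand and that WDER is the fraction of aligned words carrying the wrong speaker label. The function names and the toy labels are hypothetical.

```python
def wder(ref_speakers, hyp_speakers):
    """Fraction of aligned words whose hypothesised speaker label is wrong.

    Both arguments are equal-length lists of speaker labels, one per aligned
    word. The alignment itself is assumed to be done elsewhere.
    """
    if len(ref_speakers) != len(hyp_speakers):
        raise ValueError("label sequences must have the same length")
    if not ref_speakers:
        raise ValueError("at least one aligned word is required")
    wrong = sum(1 for r, h in zip(ref_speakers, hyp_speakers) if r != h)
    return wrong / len(ref_speakers)


def relative_reduction(baseline, improved):
    """Relative reduction of an error rate, e.g. 0.10 -> 0.08 gives 0.2."""
    return (baseline - improved) / baseline


# Toy example with made-up labels: one of four words is misattributed.
first_pass = wder(["A", "A", "B", "B"], ["A", "B", "B", "B"])
```

A second-pass corrector of the kind described would aim to lower `first_pass`. `relative_reduction` expresses that change in the same relative terms as the 15-30% figures.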
Device Directedness with Contextual Cues for Spoken Dialog Systems
In this work, we define barge-in verification as a supervised learning task
where audio-only information is used to classify user spoken dialogue into true
and false barge-ins. Following the success of pre-trained models, we use
low-level speech representations from a self-supervised representation learning
model for our downstream classification task. Further, we propose a novel
technique to infuse lexical information directly into speech representations to
improve the domain-specific language information implicitly learned during
pre-training. Experiments conducted on spoken dialog data show that our
proposed model trained to validate barge-in entirely from speech
representations is faster by 38% relative and achieves 4.5% relative F1 score
improvement over a baseline LSTM model that uses both audio and Automatic
Speech Recognition (ASR) 1-best hypotheses. On top of this, our best proposed
model with lexically infused representations along with contextual features
provides a further relative improvement of 5.7% in the F1 score but is only 22%
faster than the baseline
Peek into the Future Camera-based Occupant Sensing in Configurable Cabins for Autonomous Vehicles
The development of fully autonomous vehicles (AVs) can potentially eliminate
drivers and introduce unprecedented seating design. However, highly flexible
seat configurations may lead to occupants' unconventional poses and actions.
Understanding occupant behaviors and prioritizing safety features have become
prominent topics at the AV research frontier. Visual sensors offer the
advantages of cost-efficiency and high-fidelity imaging and are becoming more widely
applied for in-car sensing purposes. Occlusion is one big concern for this type
of system in crowded car cabins. It is important, but largely unknown, what
a visual-sensing framework should look like to support 2-D and 3-D human pose
tracking for highly configurable seats. As one of the first studies to
touch on this topic, we peek into the future camera-based sensing framework via a
simulation experiment. With representative car cabins, seat layouts, and
occupant sizes constructed, camera coverage from different angles and positions is
simulated and calculated. The comprehensive coverage data are synthesized
through an optimization process to determine the camera layout and overall
occupant coverage. The results show the number and layout of cameras needed
to fully or partially cover all the occupants with changeable
configurations of up to six seats. Comment: Conference: 2021 IEEE International Intelligent Transportation
Systems Conference (ITSC) Link: https://ieeexplore.ieee.org/document/956442
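The abstract does not say which optimization procedure was used to choose the camera layout. One simple way to picture the problem is a greedy maximum-coverage heuristic, sketched below as a stand-in technique rather than the authors' method. The camera names, the coverage sets, and the function name are hypothetical.

```python
def greedy_camera_layout(coverage, targets, max_cameras):
    """Pick cameras one at a time, each time taking the one that covers the
    most still-uncovered targets (greedy maximum coverage).

    coverage:    dict mapping a camera position name to the set of targets
                 (for example seat/occupant identifiers) it can see.
    targets:     set of all targets that should be covered.
    max_cameras: upper bound on the number of cameras to place.
    Returns (chosen camera names in order, set of covered targets).
    """
    chosen = []
    covered = set()
    remaining = dict(coverage)
    while remaining and len(chosen) < max_cameras and covered != targets:
        # sorted() makes tie-breaking deterministic (alphabetical order).
        best = max(sorted(remaining),
                   key=lambda cam: len((remaining[cam] & targets) - covered))
        gain = (remaining[best] & targets) - covered
        if not gain:
            break  # no remaining camera adds coverage
        chosen.append(best)
        covered |= gain
        del remaining[best]
    return chosen, covered


# Toy coverage data, invented for illustration only.
toy_coverage = {
    "front": {"seat1", "seat2"},
    "rear": {"seat3", "seat4"},
    "side": {"seat2", "seat3"},
}
layout, seen = greedy_camera_layout(
    toy_coverage, {"seat1", "seat2", "seat3", "seat4"}, max_cameras=2)
```

Greedy selection is not guaranteed to find the smallest layout, but it gives a quick baseline for comparing full coverage against partial coverage under a camera budget.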
Pests and predators of oysters
In all aquaculture practices the detrimental effects of
cohabiting organisms arise through predation, competition,
disease or parasitism. Hanson (1974) stated that
limited predation can serve to weed out some diseased
members of a crop and also help in controlling epizootic
infections. But large-scale mortalities result in economic
loss by reduction in the tended stock. Control of
predation also means additional expense on the production
cost (Mackenzie, 1970a). While evolving culture
methods for fish or shellfish, identifying and properly
using methods to prevent and control the numerous
predators of cultivable organisms is absolutely essential
to maximise production
Group III PLA2 from the scorpion, Mesobuthus tamulus: cloning and recombinant expression in E. coli
Phospholipases A2 (PLA2) are enzymes that specifically hydrolyze the sn-2 fatty acid acyl bond of phospholipids, producing a free fatty acid and a lyso-phospholipid. We report the cloning and expression of a secretory phospholipase A2 (sPLA2) from Mesobuthus tamulus, the Indian red scorpion. The nucleotide sequence codes for a 167-residue enzyme. The open reading frame codes for a 31-amino-acid signal peptide followed by the mature portion of the protein. The primary structure shows the calcium-binding motif, catalytic residues, 8 highly conserved cysteines and a C-terminal extension, which classify it as a group III PLA2. The entire transcript was expressed in Escherichia coli and was purified by metal affinity chromatography under denaturing conditions. The protein was refolded to its active form by serial dilutions in the refolding buffer. Hemolytic assays indicate that the protein adopts a functional conformation. Functional requisites such as an optimum pH of 8 and calcium dependency are shown. This report provides a simple but robust methodology for recombinant expression of toxic proteins
Identification of 4 New Loci Associated With Primary Hyperparathyroidism (PHPT) and a Polygenic Risk Score for PHPT
CONTEXT: A hypothesis-free genetic association analysis has not been reported for patients with primary hyperparathyroidism (PHPT). OBJECTIVE: We aimed to investigate genetic associations with PHPT using both genome-wide association study (GWAS) and candidate gene approaches. METHODS: A cross-sectional study was conducted among patients of European White ethnicity recruited in Tayside (Scotland, UK). Electronic medical records were used to identify PHPT cases and controls, and linked to genetic biobank data. Genetic associations were assessed using logistic regression models and reported as odds ratios (ORs). The combined effect of the genotypes was examined by genetic risk score (GRS) analysis. RESULTS: We identified 15 622 individuals for the GWAS that yielded 34 top single-nucleotide variations (formerly single-nucleotide polymorphisms), and LPAR3-rs147672681 reached genome-wide statistical significance (P = 1.2e-08). Using a more restricted PHPT definition, 8722 individuals with data on the GWAS-identified loci were found. Age- and sex-adjusted ORs for the effect alleles of SOX9-rs11656269, SLITRK5-rs185436526, and BCDIN3D-AS1-rs2045094 showed statistically significant increased risks (P < 1.5e-03). GRS analysis of 5482 individuals showed an OR of 2.51 (P = 1.6e-04), 3.78 (P = 4.0e-08), and 7.71 (P = 5.3e-17) for the second, third, and fourth quartiles, respectively, compared to the first, and there was a statistically significant linear trend across quartiles (P < 1.0e-04). Results were similar when stratifying by sex. CONCLUSION: Using genetic loci discovered in a GWAS of PHPT carried out in a Scottish population, this study suggests new evidence for the involvement of genetic variants at SOX9, SLITRK5, LPAR3, and BCDIN3D-AS1. It also suggests that male and female carriers of greater numbers of PHPT-risk alleles both have a statistically significant increased risk of PHPT
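For readers unfamiliar with the quantities reported above, the sketch below shows an unweighted genetic risk score (a count of effect alleles) and a crude odds ratio from a 2x2 table comparing one score group with a reference group. It is a simplification: the study used logistic regression with age and sex adjustment, whereas this is the unadjusted calculation, and all counts are invented for illustration.

```python
def risk_score(effect_allele_counts):
    """Unweighted genetic risk score: total number of effect alleles carried.

    effect_allele_counts: one entry per locus, each 0, 1 or 2.
    """
    if any(c not in (0, 1, 2) for c in effect_allele_counts):
        raise ValueError("each locus must carry 0, 1 or 2 effect alleles")
    return sum(effect_allele_counts)


def odds_ratio(cases_group, controls_group, cases_ref, controls_ref):
    """Crude (unadjusted) odds ratio of a group versus a reference group.

    OR = (cases_group / controls_group) / (cases_ref / controls_ref)
    """
    return (cases_group * controls_ref) / (controls_group * cases_ref)


# Invented counts: 30 cases and 70 controls in a high-score group,
# 10 cases and 90 controls in the reference (lowest-score) group.
toy_or = odds_ratio(30, 70, 10, 90)  # (30 * 90) / (70 * 10)
score = risk_score([0, 1, 2, 1])     # four loci, four effect alleles in total
```

An odds ratio above 1 for a higher-score group, as in the quartile comparison reported above, indicates higher odds of being a case than in the reference group.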